Research in IR puts a strong focus on evaluation, with many past and ongoing evaluation campaigns. However, most evaluations rely on offline experiments with single queries, while most IR applications are interactive, with multiple queries in a session. Moreover, context (e.g., time, location, access device, task) is rarely considered. Finally, the large variance in search topic difficulty makes performance prediction especially hard.
Several types of prediction may be relevant in IR. In one case, we have a system and a collection, and we would like to know what happens when we move to a new collection while keeping the same kind of task. In another case, we have a system, a collection, and a kind of task, and we move to a new kind of task. A further case arises when collections are fluid and the task must be supported over changing data.
Current approaches to evaluation mean that predictability can be poor, in particular:
Perhaps the most significant issue is the gap between offline and online evaluation. Correlations between system performance, user behavior, and user satisfaction are not well understood, and offline predictions of changes in user satisfaction remain poor because the mapping from metrics to user perceptions and experiences has not been established.
Submission deadline: July 9, 2018, extended to July 16, 2018
Notification of acceptance: July 30, 2018, moved to August 10, 2018
Camera ready: August 27, 2018
Workshop day: October 22, 2018
Conference days: October 23-26, 2018
General areas of interest include, but are not limited to, the following topics:
Papers should be formatted according to the ACM SIG Proceedings Template.
Beyond research papers (4-6 pages), we will solicit short (1 page) position papers from interested participants.
Papers will be reviewed by members of the program committee through double-blind peer review; submissions must therefore be anonymized. Selection will be based on originality, clarity, and technical quality. Papers should be submitted in PDF format to the following address:
https://easychair.org/conferences/?conf=glare2018
Accepted papers will be published online as a volume of the CEUR-WS proceedings series.
Ian Soboroff, National Institute of Standards and Technology (NIST), USA, ian.soboroff@nist.gov
Nicola Ferro, University of Padua, Italy, ferro@dei.unipd.it
Norbert Fuhr, University of Duisburg-Essen, Germany, norbert.fuhr@uni-due.de
Justin Zobel
School of Computing & Information Systems, University of Melbourne, Australia
Bio
Professor Justin Zobel is a Redmond Barry Distinguished Professor at the University of Melbourne in the School of Computing & Information Systems, and is currently the university's Pro-Vice Chancellor (Graduate & International Research). He received his PhD from Melbourne in 1991 and worked at RMIT until he returned to Melbourne in the late 2000s, where until recently he was Head of his School. In the research community, Professor Zobel is best known for his role in the development of algorithms for efficient web search, and is also known for research on measurement, bioinformatics, and fundamental algorithms. He is the author of three texts on graduate study and research methods, and has held a range of roles in the national and international computer science community.
Proxies and Decoys: Assumptions, Approximations, and Artefacts in Measurement of Search Systems
Research in information retrieval depends on the ability to undertake repeatable, robust measurements of search systems. Over several decades, the academic community has created measurement tools and measurement practices that are now widely accepted and used. However, these tools and practices not only have known flaws and shortcomings but remain imperfectly understood. This talk examines measurement in IR from the perspective of inconsistencies between quantitative measures of performance and the qualitative goals of IR research, and considers whether some of the shortcomings in measurement and predictivity arise from assumptions made for the purpose of producing standardised metrics. These issues suggest challenges to be addressed if measurement is to continue to support research that is enduring and defensible.
The Challenges of Moving from Web to Voice in Product Search
Offline vs. Online Evaluation in Voice Product Search
Causality, prediction and improvements that (don’t) add up
Towards a Basic Principle for Ranking Effectiveness Prediction without Human Assessments: A Preliminary Study
Novel Query Performance Predictors and their Correlations for Medical Applications
Report on the Dagstuhl Perspectives Workshop 17442 - Towards Cross-Domain Performance Modeling and Prediction: IR/RecSys/NLP
Discussion